Chunk and Clause Identification for Basque by Filtering and Ranking with Perceptrons
Abstract
This paper presents systems for syntactic chunking and clause identification for Basque that combine rule-based grammars with machine-learning techniques. Specifically, we used Filtering-Ranking with Perceptrons (Carreras, Màrquez and Castro, 2005), a learning model that recognizes partial syntactic structures in sentences and obtains state-of-the-art performance on these tasks in English. The model can incorporate a rich set of features to represent syntactic phrases, which makes it possible to use information from different sources. We exploited this property to include more linguistic features in the learning model, and the chunking results improved greatly. In this way, we compensated for the relatively small amount of training data available for Basque to learn a chunking model. For clause identification, our preliminary results are low, which may be due to the free word order of Basque and to the small corpus available.
Similar resources
Online Learning via Global Feedback for Phrase Recognition
We present a system to recognize phrases based on perceptrons, and a global online learning algorithm to train them together. The recognition strategy applies learning in two layers: a filtering layer, which reduces the search space by identifying plausible phrase candidates, and a ranking layer, which discriminatively builds the optimal phrase structure. We provide a recognition-based feedback...
Basque Functional Heads
Bill Haddican NYU 6/18/2001 0. Introduction This paper makes three claims about Basque grammar. First, it argues that Cinque’s (1999) hierarchy of functional heads largely holds for Basque. In a typical pattern, lower morphemes in the hierarchy appear in the reverse order, while higher morphemes appear in Cinque’s order. This is explained through roll-up—iterative XP movement through specifier ...
MultiGranCNN: An Architecture for General Matching of Text Chunks on Multiple Levels of Granularity
We present MultiGranCNN, a general deep learning architecture for matching text chunks. MultiGranCNN supports multigranular comparability of representations: shorter sequences in one chunk can be directly compared to longer sequences in the other chunk. MultiGranCNN also contains a flexible and modularized match feature component that is easily adaptable to different types of chunk matching. We...
Identification and ranking risks of horizontal directional drilling for oil & gas wells by using fuzzy analytic network process, a case study for Gachsaran oil field wells
Risk ranking of Horizontal Directional Drilling (HDD) for gas and oil wells is a key criterion in the project feasibility, pricing and for introducing a risk management strategy that aims to reduce the number of failures in the installation phase and its negative consequences. HDD is currently widely used in drilling wells in Iran, but research in the area of identification and risks ranking of...
Phrase recognition by filtering and ranking with perceptrons
We present a phrase recognition system based on perceptrons, and an online learning algorithm to train them together. The recognition strategy applies learning in two layers, first at word level, to filter words and form phrase candidates, second at phrase level, to rank phrases and select the optimal ones. We provide a global feedback rule which reflects the dependencies among perceptrons and ...
Journal:
- Procesamiento del Lenguaje Natural
Volume 41
Publication date: 2008